Conversation

@chenmoneygithub (Collaborator):

Previously we limited stream listeners to work only with str fields and some prebuilt types. This PR lifts that constraint and allows streaming on any type. Here is the gist:

  • StreamListener works the same way as before, i.e., it only captures streaming chunks associated with a certain field.
  • We don't perform additional handling on the structured output value for each field; we just give back the raw value.

To accommodate JSONAdapter, which doesn't have clear boilerplate like [[ ## answer ## ]] to split the fields' streams the way ChatAdapter does, we add dedicated chunk-handling logic based on jiter.
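Here is a minimal usage sketch (the model name, signature, and question are assumptions for illustration, not part of this PR):

import asyncio
import dspy
from pydantic import BaseModel

class Answer(BaseModel):
    text: str
    certainty: float

# Assumed setup: any LM works; JSONAdapter exercises the new jiter-based path.
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"), adapter=dspy.JSONAdapter())

# Listen to the Pydantic-typed "answer" field; its chunks arrive as raw strings.
program = dspy.streamify(
    dspy.Predict("question -> answer: Answer"),
    stream_listeners=[dspy.streaming.StreamListener(signature_field_name="answer")],
)

async def main():
    async for value in program(question="Why is the sky blue?"):
        if isinstance(value, dspy.streaming.StreamResponse):
            print(value.chunk, end="")  # raw JSON text of the answer field

asyncio.run(main())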

@chenmoneygithub marked this pull request as draft September 20, 2025 01:07
@chenmoneygithub marked this pull request as ready for review September 20, 2025 22:44
# Other adapters rely on the end_identifier to detect the end of the field we are listening to.
return self._default_handle_stream_chunk(token, end_identifier)

def _json_adapter_handle_stream_chunk(self, token: str, chunk_message: str) -> str:

@TomeHirata (Collaborator), Oct 20, 2025:

The return signature should be StreamResponse | None

@chenmoneygithub (Author):

thanks for catching it!

is_last_chunk=self.stream_end,
)

def _default_handle_stream_chunk(self, token: str, end_identifier: str) -> str:

Collaborator:

ditto

@chenmoneygithub (Author):

ty!

elif self.field_end_queue.qsize() > 10:
# Buffer could form end identifier, but we've exceeded max buffer size
# Yield the oldest token to prevent unbounded buffering
# We keep the last 10 tokens in the buffer to avoid sending the DSPy boilerplate tokens to users.

Collaborator:

Let's note that we buffer only if tokens in the queue can form the boilerplate.

@chenmoneygithub (Author):

sg!
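To make the buffering condition concrete, here is a standalone sketch (the function name, queue handling, and flushing policy are illustrative, not the PR's exact code):

from queue import Queue

def buffer_or_yield(queue: Queue, token: str, end_identifier: str, max_size: int = 10) -> str:
    """Buffer tokens only while they could still form the end identifier;
    otherwise flush them to the user immediately."""
    queue.put(token)
    buffered = "".join(queue.queue)
    if end_identifier.startswith(buffered):
        # The buffered tokens may be the start of the boilerplate marker:
        # hold them back, but cap the buffer so output is never withheld forever.
        return queue.get() if queue.qsize() > max_size else ""
    # The buffer can no longer form the end identifier: flush everything.
    out = []
    while not queue.empty():
        out.append(queue.get())
    return "".join(out)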

return StreamResponse(self.predict_name, self.signature_field_name, token, is_last_chunk=self.stream_end)

try:
parsed = jiter.from_json(

Collaborator:

This is interesting, can't we just count the number of { and }?

@chenmoneygithub (Author):

Discussed offline; please see the new implementation for a more robust solution.
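For illustration of why jiter is more robust than brace counting: its partial mode parses incomplete JSON correctly even when string values contain braces (this example assumes jiter's documented partial_mode flag):

import jiter

# A truncated stream whose string value contains a '{' that would
# confuse naive brace counting.
partial = b'{"answer": "use {braces} carefully, and'
parsed = jiter.from_json(partial, partial_mode="trailing-strings")
print(parsed)  # {'answer': 'use {braces} carefully, and'}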

@TomeHirata requested a review from Copilot November 4, 2025 08:40
self.cache_hit = False
self.allow_reuse = allow_reuse

self.json_adapter_state = {"field_accumulated_tokens": ""}

@TomeHirata (Collaborator), Nov 4, 2025:

Do we plan to introduce other keys to self.json_adapter_state, or can we flatten the structure?

Copilot AI (Contributor) left a comment:

Pull Request Overview

This PR adds support for streaming Pydantic models in JSONAdapter by implementing partial JSON parsing for field detection. Previously, streaming was only supported for string and dspy.Type fields.

  • Removed the _is_streamable validation check to allow Pydantic models to be streamed
  • Implemented JSONAdapter-specific streaming logic using the jiter library for partial JSON parsing
  • Updated test data to correctly format JSON string values with quotes

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

  • tests/streaming/test_streaming.py: Added comprehensive test coverage for Pydantic model streaming with ChatAdapter and JSONAdapter; corrected test assertions to include proper JSON quote formatting.
  • dspy/streaming/streaming_listener.py: Refactored streaming logic to support Pydantic models via partial JSON parsing; removed the streamability validation that blocked Pydantic models.
Comments suppressed due to low confidence (1)

tests/streaming/test_streaming.py:1122

  • The removed assertion at line 1128, assert final_prediction.answer == "According to the references, water boils at 100°C.", is no longer present, which means the final prediction's answer field is no longer validated. This reduces test coverage for the citations streaming test.
            assert "".join(answer_chunks) == "According to the references, water boils at 100°C."


# Other adapters rely on the end_identifier to detect the end of the field we are listening to.
return self._default_handle_stream_chunk(token, end_identifier)

def _json_adapter_handle_stream_chunk(self, token: str, chunk_message: str) -> StreamResponse | None:

Copilot AI, Nov 4, 2025:

Mixing implicit and explicit returns may indicate an error, as implicit returns always return None.

return self._default_handle_stream_chunk(token, end_identifier)

def _json_adapter_handle_stream_chunk(self, token: str, chunk_message: str) -> StreamResponse | None:
self.json_adapter_state["field_accumulated_tokens"] += chunk_message

Collaborator:

Why do we accumulate chunk_message instead of token?

@chenmoneygithub (Author), Nov 7, 2025:

field_accumulated_tokens is probably a bad name; I'm renaming it to field_accumulated_messages.

@chenmoneygithub (Author):

Basically, token is the value we return, and chunk_message is the new message we are receiving.
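A rough sketch of the accumulate-then-parse flow being discussed (the helper, state keys, and delta policy are hypothetical, not the PR's exact code):

import json
import jiter

def field_delta(state: dict, chunk_message: str, field_name: str) -> str | None:
    """Hypothetical: accumulate raw JSON text, partially parse it, and return
    the not-yet-yielded portion of the listened field's value."""
    state["field_accumulated_messages"] += chunk_message
    try:
        parsed = jiter.from_json(
            state["field_accumulated_messages"].encode(),
            partial_mode="trailing-strings",
        )
    except ValueError:
        return None  # Not parseable yet; keep accumulating.
    if field_name not in parsed:
        return None
    value = parsed[field_name]
    # Serialize non-str values so the caller always receives raw string chunks.
    text = value if isinstance(value, str) else json.dumps(value)
    delta = text[len(state.get("yielded", "")):]
    state["yielded"] = text
    return delta or None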

return StreamResponse(
    self.predict_name, self.signature_field_name, token, is_last_chunk=self.stream_end
)
except Exception:

Collaborator:

can't we limit this to be ValueError?

@chenmoneygithub (Author):

yes, good call!
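(For reference, jiter raises ValueError on incomplete or invalid input when partial mode is off:)

import jiter

try:
    jiter.from_json(b'{"answer": "partial')  # incomplete JSON, no partial mode
except ValueError as exc:
    print(f"jiter raised: {exc}")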

self.stream_end = True
last_token = self.flush()

keys = list(parsed.keys())

@TomeHirata (Collaborator), Nov 4, 2025:

Is parsed.keys() ordered based on the key order in the JSON string?

@chenmoneygithub (Author):

I think so. It shouldn't affect the logic here though; we just need the key name to cut off the extra characters.
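(jiter returns a plain Python dict, and Python dicts preserve insertion order, so keys come back in document order; a quick check:)

import jiter

parsed = jiter.from_json(b'{"reasoning": "...", "answer": "42"}')
print(list(parsed.keys()))  # ['reasoning', 'answer'], i.e., document order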

if field_type is str:
return True
if issubclass(field_type, Type):
return field_type.is_streamable()

Collaborator:

Shall we delete is_streamable method of Type?

@chenmoneygithub (Author):

Yes, good question. I did think about it, and my take is that it's still useful: it lets the streaming listener hit this path for certain typed fields like Citation. A custom type that isn't streamable will fall back to the normal streaming handling.

        # Handle custom streamable types
        if self._output_type and issubclass(self._output_type, Type) and self._output_type.is_streamable():
            if parsed_chunk := self._output_type.parse_stream_chunk(chunk):
                return StreamResponse(
                    self.predict_name,
                    self.signature_field_name,
                    parsed_chunk,
                    is_last_chunk=self.stream_end,
                )
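
To make the Citation remark concrete, a custom streamable type might look roughly like this (a hypothetical sketch; the exact chunk type passed to parse_stream_chunk is an assumption):

import dspy

class Shout(dspy.Type):
    """Hypothetical custom type that opts into the streamable path."""

    text: str

    @classmethod
    def is_streamable(cls) -> bool:
        return True

    @classmethod
    def parse_stream_chunk(cls, chunk):
        # chunk is assumed to be a litellm ModelResponseStream; return the
        # display text for this chunk, or None to emit nothing.
        content = chunk.choices[0].delta.content
        return content.upper() if content else None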

return StreamResponse(
    self.predict_name, self.signature_field_name, token, is_last_chunk=self.stream_end
)

Collaborator:

So overall we will return raw string chunks, and the deserialization needs to happen on the caller side?

@chenmoneygithub (Author):

Yes! It should be pretty simple for the caller to accumulate.
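
A caller-side sketch of that accumulation, reusing the Answer model from the first example (the wiring is an assumption):

chunks: list[str] = []
async for value in program(question="Why is the sky blue?"):
    if isinstance(value, dspy.streaming.StreamResponse):
        chunks.append(value.chunk)

# Once the field's stream ends, the accumulated text is the field's JSON.
answer = Answer.model_validate_json("".join(chunks))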
